vertex pair
Supplementary Material for Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering Communities
Sections A - E prove our main results in detail. We next introduce some notation pertaining to the construction of the correlated SBMs. In this section we prove Theorem 3.1 in the main text. To make these ideas more formal, we begin by defining some notation. Remark B.2. Since our goal is to bound the probability of the event To apply Lemma B.3 later on, we need to compute/estimate This leads to the subtracted term in (5).
Forecasting high-impact research topics via machine learning on evolving knowledge graphs
The exponential growth in scientific publications poses a severe challenge for human researchers. It forces attention to more narrow sub-fields, which makes it challenging to discover new impactful research ideas and collaborations outside one's own field. While there are ways to predict a scientific paper's future citation counts, they need the research to be finished and the paper written, usually assessing impact long after the idea was conceived. Here we show how to predict the impact of onsets of ideas that have never been published by researchers. For that, we developed a large evolving knowledge graph built from more than 21 million scientific papers. It combines a semantic network created from the content of the papers and an impact network created from the historic citations of papers. Using machine learning, we can predict the dynamic of the evolving network into the future with high accuracy, and thereby the impact of new research directions. We envision that the ability to predict the impact of new ideas will be a crucial component of future artificial muses that can inspire new impactful and interesting scientific ideas.
Tree Optimization Based Heuristics and Metaheuristics in Network Construction Problems
Averbakh, Igor, Pereira, Jordi
We consider a recently introduced class of network construction problems where edges of a transportation network need to be constructed by a server (construction crew). The server has a constant construction speed which is much lower than its travel speed, so relocation times are negligible with respect to construction times. It is required to find a construction schedule that minimizes a non-decreasing function of the times when various connections of interest become operational. Most problems of this class are strongly NP-hard on general networks, but are often tree-efficient, that is, polynomially solvable on trees. We develop a generic local search heuristic approach and two metaheuristics (Iterated Local Search and Tabu Search) for solving tree-efficient network construction problems on general networks, and explore them computationally. Results of computational experiments indicate that the methods have excellent performance.
Efficient Representation Learning Using Random Walks for Dynamic Graphs
Sajjad, Hooman Peiro, Docherty, Andrew, Tyshetskiy, Yuriy
An important part of many machine learning workflows on graphs is vertex representation learning, i.e., learning a low-dimensional vector representation for each vertex in the graph. Recently, several powerful techniques for unsupervised representation learning have been demonstrated to give the state-of-the-art performance in downstream tasks such as vertex classification and edge prediction. These techniques rely on random walks performed on the graph in order to capture its structural properties. These structural properties are then encoded in the vector representation space. However, most contemporary representation learning methods only apply to static graphs while real-world graphs are often dynamic and change over time. Static representation learning methods are not able to update the vector representations when the graph changes; therefore, they must re-generate the vector representations on an updated static snapshot of the graph regardless of the extent of the change in the graph. In this work, we propose computationally efficient algorithms for vertex representation learning that extend random walk based methods to dynamic graphs. The computation complexity of our algorithms depends upon the extent and rate of changes (the number of edges changed per update) and on the density of the graph. We empirically evaluate our algorithms on real world datasets for downstream machine learning tasks of multi-class and multi-label vertex classification. The results show that our algorithms can achieve competitive results to the state-of-the-art methods while being computationally efficient.
GraphGAN: Graph Representation Learning With Generative Adversarial Nets
Wang, Hongwei (Shanghai Jiao Tong University) | Wang, Jia (The Hong Kong Polytechnic University) | Wang, Jialin (Huazhong University of Science and Technology) | Zhao, Miao (The Hong Kong Polytechnic University) | Zhang, Weinan (Shanghai Jiao Tong University) | Zhang, Fuzheng (Microsoft Research Asia) | Xie, Xing (Microsoft Research Asia) | Guo, Minyi (Shanghai Jiao Tong University)
The goal of graph representation learning is to embed each vertex in a graph into a low-dimensional vector space. Existing graph representation learning methods can be classified into two categories: generative models that learn the underlying connectivity distribution in the graph, and discriminative models that predict the probability of edge existence between a pair of vertices. In this paper, we propose GraphGAN, an innovative graph representation learning framework unifying above two classes of methods, in which the generative model and discriminative model play a game-theoretical minimax game. Specifically, for a given vertex, the generative model tries to fit its underlying true connectivity distribution over all other vertices and produces "fake" samples to fool the discriminative model, while the discriminative model tries to detect whether the sampled vertex is from ground truth or generated by the generative model. With the competition between these two models, both of them can alternately and iteratively boost their performance. Moreover, when considering the implementation of generative model, we propose a novel graph softmax to overcome the limitations of traditional softmax function, which can be proven satisfying desirable properties of normalization, graph structure awareness, and computational efficiency. Through extensive experiments on real-world datasets, we demonstrate that GraphGAN achieves substantial gains in a variety of applications, including link prediction, node classification, and recommendation, over state-of-the-art baselines.
GraphGAN: Graph Representation Learning with Generative Adversarial Nets
Wang, Hongwei, Wang, Jia, Wang, Jialin, Zhao, Miao, Zhang, Weinan, Zhang, Fuzheng, Xie, Xing, Guo, Minyi
The goal of graph representation learning is to embed each vertex in a graph into a low-dimensional vector space. Existing graph representation learning methods can be classified into two categories: generative models that learn the underlying connectivity distribution in the graph, and discriminative models that predict the probability of edge existence between a pair of vertices. In this paper, we propose Graph-GAN, an innovative graph representation learning framework unifying above two classes of methods, in which the generative model and discriminative model play a game-theoretical minimax game. Specifically, for a given vertex, the generative model tries to fit its underlying true connectivity distribution over all other vertices and produces "fake" samples to fool the discriminative model, while the discriminative model tries to detect whether the sampled vertex is from ground truth or generated by the generative model. With the competition between these two models, both of them can alternately and iteratively boost their performance. Moreover, when considering the implementation of generative model, we propose a novel graph softmax to overcome the limitations of traditional softmax function, which can be proven satisfying desirable properties of normalization, graph structure awareness, and computational efficiency. Through extensive experiments on real-world datasets, we demonstrate that Graph-GAN achieves substantial gains in a variety of applications, including link prediction, node classification, and recommendation, over state-of-the-art baselines.
Scalable Graph Embedding for Asymmetric Proximity
Zhou, Chang (Peking University) | Liu, Yuqiong (Peking University) | Liu, Xiaofei (Alibaba Group) | Liu, Zhongyi (Alibaba Group) | Gao, Jun (Peking University)
Graph Embedding methods are aimed at mapping each vertex into a low dimensional vector space, which preserves certain structural relationships among the vertices in the original graph. Recently, several works have been proposed to learn embeddings based on sampled paths from the graph, e.g., DeepWalk, Line, Node2Vec. However, their methods only preserve symmetric proximities, which could be insufficient in many applications, even the underlying graph is undirected. Besides, they lack of theoretical analysis of what exactly the relationships they preserve in their embedding space. In this paper, we propose an asymmetric proximity preserving (APP) graph embedding method via random walk with restart, which captures both asymmetric and high-order similarities between node pairs. We give theoretical analysis that our method implicitly preserves the Rooted PageRank score for any two vertices. We conduct extensive experiments on tasks of link prediction and node recommendation on open source datasets, as well as online recommendation services in Alibaba Group, in which the training graph has over 290 million vertices and 18 billion edges, showing our method to be highly scalable and effective.